RFC: Add incremental encaps API to support ML-KEM Braid #1619

Draft
mkannwischer wants to merge 2 commits into main from incremental-enc-api

@mkannwischer

Split ML-KEM encapsulation into two phases, mlk_kem_enc_derand_u and mlk_kem_enc_v, to support protocols such as Braid that need to interleave other operations between computing the u- and v-components of the ciphertext. The first phase requires only the public seed and H(pk), not the full public key vector.

Internally, K-PKE.Encrypt is refactored into mlk_indcpa_enc_u and mlk_indcpa_enc_v. The non-incremental KEM path calls mlk_indcpa_enc directly to avoid serialization overhead.

The intermediate noise polynomial epp is serialized as 4-bit nibbles (256 coefficients at 4 bits each, 128 bytes in total). This encoding is chosen primarily so that no precondition on the allowed values is required.
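The 4-bit encoding described above can be modelled in a few lines of Python. This is a sketch only, not the PR's C code: the function names are hypothetical, and the layout (two's-complement nibbles, low nibble first) is an assumption that may differ from what mlk_serialize_epp and mlk_deserialize_epp actually do. It illustrates one plausible reading of the "no precondition" remark: every possible byte decodes to two values in [-8, 7], so the decoder does not have to assume anything about its input.

```python
N = 256  # coefficients per ML-KEM polynomial


def serialize_nibbles(coeffs):
    """Pack N small signed coefficients into N // 2 bytes, two per byte.

    Assumed layout (hypothetical): two's-complement nibble, low nibble first.
    """
    if len(coeffs) != N:
        raise ValueError("expected 256 coefficients")
    out = bytearray(N // 2)
    for i in range(N // 2):
        lo = coeffs[2 * i] & 0xF        # keep the low 4 bits
        hi = coeffs[2 * i + 1] & 0xF
        out[i] = lo | (hi << 4)
    return bytes(out)


def _sign_extend4(x):
    """Map a nibble 0..15 to the signed range -8..7."""
    return x - 16 if x >= 8 else x


def deserialize_nibbles(buf):
    """Unpack N // 2 bytes into N signed coefficients in [-8, 7]."""
    if len(buf) != N // 2:
        raise ValueError("expected 128 bytes")
    coeffs = []
    for b in buf:
        coeffs.append(_sign_extend4(b & 0xF))
        coeffs.append(_sign_extend4(b >> 4))
    return coeffs
```

In this model, coefficients in [-8, 7] round-trip exactly, and decoding arbitrary bytes still yields values inside that range.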

mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 325ab51 to 285fc8a (March 12, 2026 05:37)
mkannwischer added the "benchmark" label ("this PR should be benchmarked in CI") on Mar 12, 2026

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 11853 cycles 11776 cycles 1.01
ML-KEM-512 encaps 13123 cycles 13065 cycles 1.00
ML-KEM-512 decaps 17004 cycles 17065 cycles 1.00
ML-KEM-768 keypair 19392 cycles 19373 cycles 1.00
ML-KEM-768 encaps 20761 cycles 20701 cycles 1.00
ML-KEM-768 decaps 26367 cycles 26559 cycles 0.99
ML-KEM-1024 keypair 27984 cycles 28108 cycles 1.00
ML-KEM-1024 encaps 30458 cycles 30353 cycles 1.00
ML-KEM-1024 decaps 37829 cycles 37481 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

ppc64le (POWER10) benchmarks

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 37821 cycles 38203 cycles 0.99
ML-KEM-512 encaps 43056 cycles 43487 cycles 0.99
ML-KEM-512 decaps 53357 cycles 53659 cycles 0.99
ML-KEM-768 keypair 66725 cycles 66779 cycles 1.00
ML-KEM-768 encaps 75736 cycles 75888 cycles 1.00
ML-KEM-768 decaps 89918 cycles 90189 cycles 1.00
ML-KEM-1024 keypair 108567 cycles 114656 cycles 0.95
ML-KEM-1024 encaps 118737 cycles 125597 cycles 0.95
ML-KEM-1024 decaps 137142 cycles 144918 cycles 0.95

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 14465 cycles 14441 cycles 1.00
ML-KEM-512 encaps 15942 cycles 15934 cycles 1.00
ML-KEM-512 decaps 21395 cycles 21336 cycles 1.00
ML-KEM-768 keypair 23773 cycles 23760 cycles 1.00
ML-KEM-768 encaps 25115 cycles 25216 cycles 1.00
ML-KEM-768 decaps 33011 cycles 32943 cycles 1.00
ML-KEM-1024 keypair 33471 cycles 33378 cycles 1.00
ML-KEM-1024 encaps 35824 cycles 35867 cycles 1.00
ML-KEM-1024 decaps 46179 cycles 46149 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 encaps 16707 cycles 15974 cycles 1.05
ML-KEM-768 decaps 35711 cycles 33345 cycles 1.07
ML-KEM-1024 decaps 50650 cycles 46735 cycles 1.08

This comment was automatically generated by workflow using github-action-benchmark.
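The alert rule quoted in this comment can be checked by hand. The sketch below assumes that the ratio column is current cycles divided by previous cycles, shown to two decimals, and that a row is flagged when the ratio exceeds the 1.03 threshold. The helper names are hypothetical and this is not the benchmark action's own code.

```python
THRESHOLD = 1.03  # alert threshold quoted in the bot comment


def ratio(current_cycles, previous_cycles):
    """Cycle-count ratio; values above 1.0 mean the current commit is slower."""
    return current_cycles / previous_cycles


def regressions(rows, threshold=THRESHOLD):
    """Return (name, ratio rounded to 2 decimals) for rows above the threshold."""
    return [
        (name, round(ratio(cur, prev), 2))
        for name, cur, prev in rows
        if ratio(cur, prev) > threshold
    ]


# Rows from the alert table above: (benchmark, current cycles, previous cycles).
rows = [
    ("ML-KEM-512 encaps", 16707, 15974),
    ("ML-KEM-768 decaps", 35711, 33345),
    ("ML-KEM-1024 decaps", 50650, 46735),
]

for name, r in regressions(rows):
    print(f"{name}: {r:.2f}")
```

Under these assumptions all three rows exceed the threshold, which matches the ratios reported in the table.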

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 27685 cycles 27661 cycles 1.00
ML-KEM-512 encaps 34281 cycles 35679 cycles 0.96
ML-KEM-512 decaps 43870 cycles 44306 cycles 0.99
ML-KEM-768 keypair 44503 cycles 44223 cycles 1.01
ML-KEM-768 encaps 54127 cycles 55144 cycles 0.98
ML-KEM-768 decaps 66760 cycles 68201 cycles 0.98
ML-KEM-1024 keypair 68040 cycles 67847 cycles 1.00
ML-KEM-1024 encaps 80112 cycles 78956 cycles 1.01
ML-KEM-1024 decaps 97058 cycles 96217 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 12825 cycles 12868 cycles 1.00
ML-KEM-512 encaps 14241 cycles 14232 cycles 1.00
ML-KEM-512 decaps 18988 cycles 18980 cycles 1.00
ML-KEM-768 keypair 21586 cycles 21563 cycles 1.00
ML-KEM-768 encaps 22718 cycles 22720 cycles 1.00
ML-KEM-768 decaps 29794 cycles 29740 cycles 1.00
ML-KEM-1024 keypair 30629 cycles 30507 cycles 1.00
ML-KEM-1024 encaps 32695 cycles 32653 cycles 1.00
ML-KEM-1024 decaps 41956 cycles 41954 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-512 keypair 13236 cycles 12779 cycles 1.04
ML-KEM-512 encaps 15642 cycles 14273 cycles 1.10
ML-KEM-768 decaps 32957 cycles 30058 cycles 1.10
ML-KEM-1024 keypair 34340 cycles 32987 cycles 1.04
ML-KEM-1024 decaps 47071 cycles 42393 cycles 1.11

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 17921 cycles 17866 cycles 1.00
ML-KEM-512 encaps 20012 cycles 20054 cycles 1.00
ML-KEM-512 decaps 26695 cycles 26609 cycles 1.00
ML-KEM-768 keypair 32510 cycles 29935 cycles 1.09
ML-KEM-768 encaps 31503 cycles 32680 cycles 0.96
ML-KEM-768 decaps 41528 cycles 41409 cycles 1.00
ML-KEM-1024 keypair 42162 cycles 42984 cycles 0.98
ML-KEM-1024 encaps 46821 cycles 45073 cycles 1.04
ML-KEM-1024 decaps 60681 cycles 58039 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-768 keypair 32510 cycles 29935 cycles 1.09
ML-KEM-1024 encaps 46821 cycles 45073 cycles 1.04
ML-KEM-1024 decaps 60681 cycles 58039 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 40201 cycles 40213 cycles 1.00
ML-KEM-512 encaps 48542 cycles 48370 cycles 1.00
ML-KEM-512 decaps 62176 cycles 62174 cycles 1.00
ML-KEM-768 keypair 62641 cycles 62566 cycles 1.00
ML-KEM-768 encaps 74682 cycles 74803 cycles 1.00
ML-KEM-768 decaps 92226 cycles 92304 cycles 1.00
ML-KEM-1024 keypair 94939 cycles 95014 cycles 1.00
ML-KEM-1024 encaps 109951 cycles 110210 cycles 1.00
ML-KEM-1024 decaps 132335 cycles 132320 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 36774 cycles 36792 cycles 1.00
ML-KEM-512 encaps 42794 cycles 42829 cycles 1.00
ML-KEM-512 decaps 55540 cycles 55590 cycles 1.00
ML-KEM-768 keypair 57956 cycles 58040 cycles 1.00
ML-KEM-768 encaps 66815 cycles 66873 cycles 1.00
ML-KEM-768 decaps 83502 cycles 83563 cycles 1.00
ML-KEM-1024 keypair 88561 cycles 88564 cycles 1.00
ML-KEM-1024 encaps 99020 cycles 99066 cycles 1.00
ML-KEM-1024 decaps 120497 cycles 120649 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 28243 cycles 28235 cycles 1.00
ML-KEM-512 encaps 34131 cycles 34092 cycles 1.00
ML-KEM-512 decaps 44543 cycles 44505 cycles 1.00
ML-KEM-768 keypair 47617 cycles 47612 cycles 1.00
ML-KEM-768 encaps 53735 cycles 53774 cycles 1.00
ML-KEM-768 decaps 68350 cycles 68448 cycles 1.00
ML-KEM-1024 keypair 70247 cycles 70164 cycles 1.00
ML-KEM-1024 encaps 78545 cycles 78671 cycles 1.00
ML-KEM-1024 decaps 98249 cycles 98323 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 17672 cycles 17688 cycles 1.00
ML-KEM-512 encaps 20575 cycles 20580 cycles 1.00
ML-KEM-512 decaps 27019 cycles 27041 cycles 1.00
ML-KEM-768 keypair 29855 cycles 29884 cycles 1.00
ML-KEM-768 encaps 32645 cycles 32622 cycles 1.00
ML-KEM-768 decaps 41842 cycles 41908 cycles 1.00
ML-KEM-1024 keypair 43752 cycles 43719 cycles 1.00
ML-KEM-1024 encaps 48716 cycles 48697 cycles 1.00
ML-KEM-1024 decaps 61322 cycles 61364 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 47333 cycles 46788 cycles 1.01
ML-KEM-512 encaps 55761 cycles 55610 cycles 1.00
ML-KEM-512 decaps 71489 cycles 71256 cycles 1.00
ML-KEM-768 keypair 73869 cycles 74467 cycles 0.99
ML-KEM-768 encaps 86607 cycles 86021 cycles 1.01
ML-KEM-768 decaps 107279 cycles 107222 cycles 1.00
ML-KEM-1024 keypair 111424 cycles 111541 cycles 1.00
ML-KEM-1024 encaps 126627 cycles 126103 cycles 1.00
ML-KEM-1024 decaps 152476 cycles 152172 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 35229 cycles 35280 cycles 1.00
ML-KEM-512 encaps 39983 cycles 40023 cycles 1.00
ML-KEM-512 decaps 50731 cycles 50626 cycles 1.00
ML-KEM-768 keypair 56685 cycles 56804 cycles 1.00
ML-KEM-768 encaps 63645 cycles 64047 cycles 0.99
ML-KEM-768 decaps 78117 cycles 78489 cycles 1.00
ML-KEM-1024 keypair 87304 cycles 87373 cycles 1.00
ML-KEM-1024 encaps 97083 cycles 96672 cycles 1.00
ML-KEM-1024 decaps 115151 cycles 114712 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 18670 cycles 18678 cycles 1.00
ML-KEM-512 encaps 21800 cycles 21864 cycles 1.00
ML-KEM-512 decaps 28786 cycles 28823 cycles 1.00
ML-KEM-768 keypair 31528 cycles 31567 cycles 1.00
ML-KEM-768 encaps 34667 cycles 34683 cycles 1.00
ML-KEM-768 decaps 44698 cycles 44777 cycles 1.00
ML-KEM-1024 keypair 46196 cycles 46125 cycles 1.00
ML-KEM-1024 encaps 51503 cycles 51493 cycles 1.00
ML-KEM-1024 decaps 64986 cycles 64964 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 28204 cycles 28236 cycles 1.00
ML-KEM-512 encaps 34133 cycles 34084 cycles 1.00
ML-KEM-512 decaps 44506 cycles 44432 cycles 1.00
ML-KEM-768 keypair 47615 cycles 47625 cycles 1.00
ML-KEM-768 encaps 53749 cycles 53803 cycles 1.00
ML-KEM-768 decaps 68333 cycles 68517 cycles 1.00
ML-KEM-1024 keypair 70167 cycles 70259 cycles 1.00
ML-KEM-1024 encaps 78556 cycles 78648 cycles 1.00
ML-KEM-1024 decaps 98212 cycles 98319 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 38828 cycles 38908 cycles 1.00
ML-KEM-512 encaps 44405 cycles 44381 cycles 1.00
ML-KEM-512 decaps 56276 cycles 56214 cycles 1.00
ML-KEM-768 keypair 62258 cycles 62498 cycles 1.00
ML-KEM-768 encaps 70243 cycles 70439 cycles 1.00
ML-KEM-768 decaps 86192 cycles 86364 cycles 1.00
ML-KEM-1024 keypair 95777 cycles 95880 cycles 1.00
ML-KEM-1024 encaps 106465 cycles 106092 cycles 1.00
ML-KEM-1024 decaps 126189 cycles 125748 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

Benchmark suite Current: e9ea411 Previous: b13df87 Ratio
ML-KEM-512 keypair 58745 cycles 58742 cycles 1.00
ML-KEM-512 encaps 68587 cycles 68447 cycles 1.00
ML-KEM-512 decaps 87502 cycles 87352 cycles 1.00
ML-KEM-768 keypair 94987 cycles 94443 cycles 1.01
ML-KEM-768 encaps 109213 cycles 108796 cycles 1.00
ML-KEM-768 decaps 134323 cycles 133769 cycles 1.00
ML-KEM-1024 keypair 151089 cycles 150776 cycles 1.00
ML-KEM-1024 encaps 165508 cycles 165506 cycles 1.00
ML-KEM-1024 decaps 198873 cycles 198660 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-512)

Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** 1357s 1490s -8.9%
mlk_indcpa_keypair_derand 247s 254s -3%
mlk_poly_rej_uniform 132s 153s -14%
mlk_rej_uniform_c 125s 140s -11%
mlk_indcpa_enc_u 61s - new
mlk_polyvec_basemul_acc_montgomery_cached_c 50s 59s -15%
poly_ntt_native 40s 41s -2%
mlk_poly_reduce_native 35s 38s -8%
mlk_ntt_layer 33s 37s -11%
mlk_keccak_squeezeblocks_x4 25s 27s -7%
mlk_indcpa_dec 17s 18s -6%
keccakf1600x4_permute_native_x4 16s 16s +0%
mlk_fqmul 15s 16s -6%
mlk_poly_decompress_d10_native 15s 15s +0%
mlk_poly_decompress_d4_native 13s 15s -13%
mlk_polyvec_add 13s 11s +18%
mlk_enc_derand_u 11s - new
mlk_indcpa_enc_v 11s - new
mlk_poly_frommsg 11s 12s -8%
mlk_keccak_squeezeblocks 9s 8s +12%
mlk_poly_ntt 9s 7s +29%
mlk_keccakf1600_permute_c 8s 3s +167%
mlk_keccak_squeeze_once 7s 7s +0%
mlk_ntt_butterfly_block 7s 8s -12%
mlk_poly_compress_d10_native 7s 4s +75%
mlk_poly_frombytes_native 7s 9s -22%
polyvec_basemul_acc_montgomery_cached_native 7s 5s +40%
rej_uniform_native_x86_64 7s 5s +40%
mlk_ct_cmask_nonzero_u8 6s 1s +500%
mlk_poly_cbd_eta2 6s 4s +50%
kem_dec 5s 5s +0%
mlk_enc_v 5s - new
mlk_poly_add 5s 4s +25%
mlk_poly_compress_d10_c 5s 4s +25%
mlk_poly_decompress_d4_c 5s 3s +67%
mlk_poly_ntt_c 5s 4s +25%
mlk_poly_rej_uniform_x4 5s 6s -17%
mlk_polyvec_mulcache_compute 5s 1s +400%
mlk_serialize_epp 5s - new
mlk_shake256x4 5s 3s +67%
nttunpack_native_x86_64 5s 4s +25%
poly_decompress_d4_native_x86_64 5s 5s +0%
poly_decompress_d5_native_x86_64 5s 2s +150%
kem_enc 4s 2s +100%
mlk_deserialize_epp 4s - new
mlk_indcpa_enc 4s 171s -98%
mlk_invntt_layer 4s 6s -33%
mlk_keccak_absorb_once 4s 4s +0%
mlk_keccak_absorb_once_x4 4s 7s -43%
mlk_keccakf1600x4_extract_bytes_c 4s 3s +33%
mlk_poly_decompress_d10 4s 4s +0%
mlk_poly_decompress_dv 4s 3s +33%
mlk_poly_getnoise_eta1_4x 4s 2s +100%
mlk_scalar_decompress_d4 4s 1s +300%
poly_compress_d10_native_x86_64 4s 1s +300%
poly_decompress_d10_native_x86_64 4s 4s +0%
poly_decompress_d11_native_x86_64 4s 3s +33%
poly_frombytes_native_x86_64 4s 6s -33%
rej_uniform_native 4s 3s +33%
sys_check_capability 4s 1s +300%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccakf1600_permute_native 3s 2s +50%
mlk_ct_cmov_zero 3s 3s +0%
mlk_ct_sel_uint8 3s 2s +50%
mlk_gen_matrix_serial 3s 2s +50%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_xor_bytes (big endian) 3s 1s +200%
mlk_keccakf1600x4_xor_bytes_c 3s 1s +200%
mlk_poly_compress_d11_native 3s 2s +50%
mlk_poly_compress_d5 3s 3s +0%
mlk_poly_compress_d5_c 3s 1s +200%
mlk_poly_decompress_d5_c 3s 2s +50%
mlk_poly_frombytes_c 3s 1s +200%
mlk_poly_getnoise_eta1122_4x 3s 4s -25%
mlk_poly_reduce 3s 3s +0%
mlk_poly_tomsg 3s 2s +50%
mlk_polyvec_basemul_acc_montgomery_cached 3s 1s +200%
mlk_polyvec_permute_bitrev_to_custom 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 2s +50%
mlk_scalar_compress_d1 3s 1s +200%
mlk_scalar_compress_d10 3s 1s +200%
mlk_scalar_compress_d4 3s 3s +0%
mlk_scalar_compress_d5 3s 2s +50%
mlk_scalar_decompress_d10 3s 4s -25%
mlk_scalar_decompress_d11 3s 1s +200%
mlk_sha3_256 3s 5s -40%
mlk_sha3_512 3s 1s +200%
mlk_shake128x4_squeezeblocks 3s 1s +200%
mlk_value_barrier_i32 3s 3s +0%
poly_compress_d5_native_x86_64 3s 3s +0%
poly_reduce_native_aarch64 3s 2s +50%
poly_tobytes_native_aarch64 3s 2s +50%
poly_tomont_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
intt_native_aarch64 2s 3s -33%
intt_native_x86_64 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 3s -33%
kem_check_pk 2s 3s -33%
kem_check_sk 2s 5s -60%
kem_enc_derand 2s 4s -50%
kem_keypair 2s 2s +0%
kem_keypair_derand 2s 3s -33%
mlk_barrett_reduce 2s 1s +100%
mlk_check_pct 2s 2s +0%
mlk_ct_cmask_neg_i16 2s 2s +0%
mlk_ct_sel_int16 2s 3s -33%
mlk_deserialize_polyvec_16le 2s - new
mlk_gen_matrix 2s 3s -33%
mlk_keccakf1600_xor_bytes 2s 3s -33%
mlk_keccakf1600x4_permute 2s 2s +0%
mlk_keccakf1600x4_xor_bytes 2s 1s +100%
mlk_montgomery_reduce 2s 2s +0%
mlk_poly_compress_d11 2s 2s +0%
mlk_poly_compress_d4 2s 1s +100%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10_c 2s 3s -33%
mlk_poly_decompress_d11_c 2s 3s -33%
mlk_poly_decompress_d11_native 2s 2s +0%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_d5 2s 1s +100%
mlk_poly_decompress_d5_native 2s 1s +100%
mlk_poly_getnoise_eta1_4x_native 2s 4s -50%
mlk_poly_getnoise_eta2 2s 3s -33%
mlk_poly_invntt_tomont 2s 2s +0%
mlk_poly_invntt_tomont_c 2s 2s +0%
mlk_poly_mulcache_compute_c 2s 4s -50%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_sub 2s 2s +0%
mlk_poly_tobytes_c 2s 2s +0%
mlk_poly_tobytes_native 2s 1s +100%
mlk_poly_tomont 2s 2s +0%
mlk_poly_tomont_native 2s 2s +0%
mlk_polymat_permute_bitrev_to_custom 2s 2s +0%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_decompress_du 2s 2s +0%
mlk_polyvec_frombytes 2s 4s -50%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_reduce 2s 3s -33%
mlk_polyvec_tobytes 2s 2s +0%
mlk_scalar_compress_d11 2s 1s +100%
mlk_scalar_decompress_d5 2s 2s +0%
mlk_scalar_signed_to_unsigned_q 2s 5s -60%
mlk_serialize_polyvec_16le 2s - new
mlk_shake128_absorb_once 2s 2s +0%
mlk_shake128_squeezeblocks 2s 2s +0%
mlk_shake128x4_absorb_once 2s 1s +100%
mlk_value_barrier_u32 2s 2s +0%
mlk_value_barrier_u8 2s 2s +0%
ntt_native_aarch64 2s 2s +0%
poly_compress_d4_native_x86_64 2s 3s -33%
poly_getnoise_eta1122_4x_native 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_mulcache_compute_native_aarch64 2s 4s -50%
poly_mulcache_compute_native_x86_64 2s 2s +0%
poly_reduce_native_x86_64 2s 1s +100%
poly_tomont_native_aarch64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 2s +0%
rej_uniform_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64 1s 3s -67%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 5s -80%
keccak_f1600_x4_native_avx2 1s 1s +0%
keccakf1600x4_xor_bytes_native 1s 2s -50%
mlk_ct_cmask_nonzero_u16 1s 3s -67%
mlk_ct_get_optblocker_i32 1s 1s +0%
mlk_ct_get_optblocker_u32 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 2s -50%
mlk_ct_memcmp 1s 2s -50%
mlk_keccakf1600_extract_bytes 1s 2s -50%
mlk_keccakf1600_permute 1s 1s +0%
mlk_keccakf1600x4_extract_bytes 1s 2s -50%
mlk_keypair_getnoise_eta1 1s 2s -50%
mlk_matvec_mul 1s 2s -50%
mlk_poly_cbd_eta1 1s 2s -50%
mlk_poly_compress_d10 1s 2s -50%
mlk_poly_compress_d11_c 1s 1s +0%
mlk_poly_compress_d4_c 1s 3s -67%
mlk_poly_compress_d4_native 1s 3s -67%
mlk_poly_compress_d5_native 1s 4s -75%
mlk_poly_compress_du 1s 2s -50%
mlk_poly_decompress_d11 1s 3s -67%
mlk_poly_decompress_du 1s 4s -75%
mlk_poly_frombytes 1s 2s -50%
mlk_poly_mulcache_compute 1s 2s -50%
mlk_poly_mulcache_compute_native 1s 3s -67%
mlk_poly_tobytes 1s 2s -50%
mlk_poly_tomont_c 1s 1s +0%
mlk_polyvec_ntt 1s 1s +0%
mlk_polyvec_tomont 1s 1s +0%
mlk_rej_uniform 1s 2s -50%
mlk_shake256 1s 4s -75%
ntt_native_x86_64 1s 5s -80%
poly_compress_d11_native_x86_64 1s 2s -50%
poly_tobytes_native_x86_64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 1s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 1s 4s -75%

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-768)

Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** 1436s 1400s +2.6%
mlk_indcpa_keypair_derand 206s 185s +11%
mlk_poly_rej_uniform 173s 145s +19%
mlk_rej_uniform_c 146s 127s +15%
mlk_polyvec_basemul_acc_montgomery_cached_c 54s 47s +15%
poly_ntt_native 50s 38s +32%
mlk_ntt_layer 44s 30s +47%
mlk_indcpa_enc_u 41s - new
mlk_poly_reduce_native 41s 35s +17%
mlk_keccak_squeezeblocks_x4 30s 24s +25%
mlk_fqmul 21s 17s +24%
mlk_poly_decompress_d4_native 19s 13s +46%
polyvec_basemul_acc_montgomery_cached_native 18s 20s -10%
keccakf1600x4_permute_native_x4 17s 18s -6%
mlk_poly_decompress_d10_native 17s 16s +6%
mlk_indcpa_dec 16s 12s +33%
mlk_poly_frommsg 14s 9s +56%
mlk_indcpa_enc_v 13s - new
mlk_poly_frombytes_native 12s 9s +33%
mlk_polyvec_add 10s 9s +11%
mlk_ntt_butterfly_block 9s 8s +12%
mlk_invntt_layer 8s 6s +33%
mlk_keccak_squeeze_once 8s 8s +0%
mlk_keccak_squeezeblocks 8s 9s -11%
mlk_keccak_absorb_once_x4 7s 7s +0%
mlk_poly_rej_uniform_x4 7s 4s +75%
keccakf1600x4_extract_bytes_native 6s 3s +100%
mlk_enc_v 6s - new
mlk_gen_matrix_serial 6s 3s +100%
mlk_keccak_absorb_once 6s 6s +0%
mlk_keccakf1600_permute_c 6s 5s +20%
mlk_poly_ntt 6s 8s -25%
poly_decompress_d10_native_x86_64 6s 5s +20%
rej_uniform_native 6s 4s +50%
rej_uniform_native_x86_64 6s 7s -14%
mlk_enc_derand_u 5s - new
poly_decompress_d4_native_x86_64 5s 5s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 5s 3s +67%
kem_dec 4s 5s -20%
mlk_gen_matrix 4s 3s +33%
mlk_keccakf1600x4_extract_bytes_c 4s 2s +100%
mlk_poly_compress_d10_c 4s 2s +100%
mlk_poly_compress_d4_native 4s 1s +300%
mlk_poly_compress_dv 4s 2s +100%
mlk_poly_decompress_d5_c 4s 2s +100%
mlk_poly_decompress_du 4s 2s +100%
mlk_poly_decompress_dv 4s 3s +33%
mlk_poly_invntt_tomont_c 4s 3s +33%
mlk_poly_ntt_c 4s 3s +33%
mlk_poly_tobytes_native 4s 3s +33%
mlk_poly_tomont_c 4s 2s +100%
mlk_polyvec_frombytes 4s 2s +100%
mlk_rej_uniform 4s 3s +33%
mlk_scalar_compress_d11 4s 3s +33%
mlk_shake128_squeezeblocks 4s 2s +100%
mlk_shake128x4_squeezeblocks 4s 1s +300%
poly_compress_d5_native_x86_64 4s 4s +0%
poly_frombytes_native_x86_64 4s 4s +0%
poly_mulcache_compute_native_x86_64 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 3s +0%
kem_check_pk 3s 4s -25%
kem_check_sk 3s 1s +200%
kem_enc_derand 3s 3s +0%
kem_keypair 3s 3s +0%
kem_keypair_derand 3s 3s +0%
mlk_check_pct 3s 3s +0%
mlk_ct_cmov_zero 3s 3s +0%
mlk_indcpa_enc 3s 173s -98%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_xor_bytes (big endian) 3s 4s -25%
mlk_keccakf1600x4_permute 3s 1s +200%
mlk_poly_add 3s 3s +0%
mlk_poly_cbd_eta1 3s 3s +0%
mlk_poly_cbd_eta2 3s 1s +200%
mlk_poly_compress_d11_native 3s 1s +200%
mlk_poly_compress_d4_c 3s 4s -25%
mlk_poly_compress_d5_native 3s 3s +0%
mlk_poly_decompress_d11_native 3s 3s +0%
mlk_poly_decompress_d4 3s 3s +0%
mlk_poly_frombytes 3s 2s +50%
mlk_poly_frombytes_c 3s 4s -25%
mlk_poly_getnoise_eta1_4x 3s 3s +0%
mlk_poly_invntt_tomont 3s 2s +50%
mlk_poly_mulcache_compute_c 3s 4s -25%
mlk_poly_sub 3s 4s -25%
mlk_poly_tomsg 3s 3s +0%
mlk_polymat_permute_bitrev_to_custom 3s 2s +50%
mlk_polyvec_compress_du 3s 1s +200%
mlk_polyvec_permute_bitrev_to_custom 3s 2s +50%
mlk_scalar_decompress_d11 3s 1s +200%
mlk_scalar_signed_to_unsigned_q 3s 6s -50%
mlk_serialize_epp 3s - new
mlk_serialize_polyvec_16le 3s - new
mlk_shake256 3s 3s +0%
mlk_shake256x4 3s 3s +0%
mlk_value_barrier_u8 3s 3s +0%
ntt_native_aarch64 3s 3s +0%
nttunpack_native_x86_64 3s 4s -25%
poly_invntt_tomont_native 3s 3s +0%
poly_mulcache_compute_native_aarch64 3s 2s +50%
poly_tomont_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 4s -25%
intt_native_aarch64 2s 1s +100%
intt_native_x86_64 2s 3s -33%
keccak_f1600_x1_native_aarch64 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccakf1600_permute_native 2s 1s +100%
kem_enc 2s 1s +100%
mlk_ct_cmask_nonzero_u16 2s 3s -33%
mlk_ct_get_optblocker_i32 2s 3s -33%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_get_optblocker_u8 2s 2s +0%
mlk_deserialize_polyvec_16le 2s - new
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keypair_getnoise_eta1 2s 2s +0%
mlk_matvec_mul 2s 4s -50%
mlk_poly_compress_d10_native 2s 4s -50%
mlk_poly_compress_d11_c 2s 2s +0%
mlk_poly_compress_d5 2s 3s -33%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d11_c 2s 2s +0%
mlk_poly_decompress_d4_c 2s 4s -50%
mlk_poly_decompress_d5_native 2s 2s +0%
mlk_poly_getnoise_eta1_4x_native 2s 2s +0%
mlk_poly_getnoise_eta2 2s 3s -33%
mlk_poly_mulcache_compute_native 2s 2s +0%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_tobytes 2s 3s -33%
mlk_poly_tomont 2s 1s +100%
mlk_poly_tomont_native 2s 1s +100%
mlk_polyvec_decompress_du 2s 3s -33%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_ntt 2s 2s +0%
mlk_polyvec_reduce 2s 2s +0%
mlk_polyvec_tomont 2s 2s +0%
mlk_scalar_compress_d1 2s 2s +0%
mlk_scalar_compress_d10 2s 1s +100%
mlk_scalar_compress_d5 2s 3s -33%
mlk_scalar_decompress_d10 2s 4s -50%
mlk_scalar_decompress_d4 2s 1s +100%
mlk_shake128x4_absorb_once 2s 3s -33%
mlk_value_barrier_i32 2s 3s -33%
mlk_value_barrier_u32 2s 3s -33%
ntt_native_x86_64 2s 2s +0%
poly_compress_d10_native_x86_64 2s 1s +100%
poly_compress_d11_native_x86_64 2s 4s -50%
poly_compress_d4_native_x86_64 2s 2s +0%
poly_decompress_d5_native_x86_64 2s 2s +0%
poly_getnoise_eta1122_4x_native 2s 3s -33%
poly_reduce_native_aarch64 2s 3s -33%
poly_reduce_native_x86_64 2s 3s -33%
poly_tobytes_native_aarch64 2s 2s +0%
poly_tobytes_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 3s -33%
rej_uniform_native_aarch64 2s 4s -50%
sys_check_capability 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 1s 3s -67%
mlk_barrett_reduce 1s 2s -50%
mlk_ct_cmask_neg_i16 1s 2s -50%
mlk_ct_cmask_nonzero_u8 1s 1s +0%
mlk_ct_memcmp 1s 2s -50%
mlk_ct_sel_int16 1s 2s -50%
mlk_ct_sel_uint8 1s 3s -67%
mlk_deserialize_epp 1s - new
mlk_keccakf1600_extract_bytes 1s 3s -67%
mlk_keccakf1600_permute 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 3s -67%
mlk_keccakf1600x4_xor_bytes 1s 2s -50%
mlk_keccakf1600x4_xor_bytes_c 1s 1s +0%
mlk_montgomery_reduce 1s 1s +0%
mlk_poly_compress_d10 1s 3s -67%
mlk_poly_compress_d11 1s 3s -67%
mlk_poly_compress_d4 1s 2s -50%
mlk_poly_compress_d5_c 1s 4s -75%
mlk_poly_decompress_d10 1s 1s +0%
mlk_poly_decompress_d10_c 1s 1s +0%
mlk_poly_decompress_d5 1s 1s +0%
mlk_poly_getnoise_eta1122_4x 1s 1s +0%
mlk_poly_mulcache_compute 1s 2s -50%
mlk_poly_reduce 1s 1s +0%
mlk_poly_tobytes_c 1s 5s -80%
mlk_polyvec_basemul_acc_montgomery_cached 1s 1s +0%
mlk_polyvec_permute_bitrev_to_custom_native 1s 3s -67%
mlk_polyvec_tobytes 1s 2s -50%
mlk_scalar_compress_d4 1s 3s -67%
mlk_scalar_decompress_d5 1s 3s -67%
mlk_sha3_256 1s 3s -67%
mlk_sha3_512 1s 1s +0%
mlk_shake128_absorb_once 1s 1s +0%
poly_decompress_d11_native_x86_64 1s 3s -67%
poly_tomont_native_aarch64 1s 2s -50%

@oqs-bot

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-1024)

⚠️ Attention Required

Proof Status Current Previous Change
**TOTAL** ⚠️ 1772s 1380s +28.4%
Full Results (198 proofs)
Proof Status Current Previous Change
**TOTAL** ⚠️ 1772s 1380s +28.4%
mlk_indcpa_enc_u 491s - new
mlk_poly_rej_uniform 144s 159s -9%
mlk_rej_uniform_c 129s 131s -2%
mlk_indcpa_keypair_derand 126s 124s +2%
mlk_polyvec_basemul_acc_montgomery_cached_c 77s 77s +0%
poly_ntt_native 41s 41s +0%
polyvec_basemul_acc_montgomery_cached_native 38s 36s +6%
mlk_poly_reduce_native 35s 34s +3%
mlk_ntt_layer 31s 28s +11%
mlk_keccak_squeezeblocks_x4 27s 28s -4%
mlk_fqmul 16s 14s +14%
keccakf1600x4_permute_native_x4 15s 17s -12%
mlk_poly_decompress_d11_native 15s 14s +7%
mlk_poly_decompress_d5_native 14s 12s +17%
mlk_indcpa_enc_v 12s - new
mlk_polyvec_add 12s 11s +9%
mlk_indcpa_dec 11s 10s +10%
mlk_poly_frommsg 10s 12s -17%
mlk_enc_derand_u 9s - new
mlk_poly_frombytes_native 9s 9s +0%
mlk_keccak_squeeze_once 8s 8s +0%
mlk_ntt_butterfly_block 8s 7s +14%
mlk_poly_ntt 8s 7s +14%
rej_uniform_native_x86_64 8s 7s +14%
mlk_invntt_layer 7s 6s +17%
mlk_keccak_absorb_once_x4 7s 5s +40%
mlk_keccak_squeezeblocks 7s 7s +0%
mlk_keccakf1600_permute_c 7s 3s +133%
poly_decompress_d5_native_x86_64 7s 5s +40%
mlk_polymat_permute_bitrev_to_custom 6s 7s -14%
mlk_polyvec_ntt 6s 5s +20%
kem_dec 5s 4s +25%
mlk_enc_v 5s - new
mlk_gen_matrix_serial 5s 5s +0%
mlk_poly_add 5s 3s +67%
mlk_poly_compress_d11 5s 2s +150%
mlk_poly_compress_d11_c 5s 5s +0%
mlk_poly_getnoise_eta1_4x 5s 3s +67%
mlk_poly_mulcache_compute_c 5s 4s +25%
mlk_poly_rej_uniform_x4 5s 5s +0%
mlk_poly_tomsg 5s 5s +0%
mlk_polyvec_mulcache_compute 5s 3s +67%
poly_decompress_d11_native_x86_64 5s 4s +25%
poly_tomont_native_aarch64 5s 3s +67%
kem_check_pk 4s 3s +33%
kem_enc 4s 4s +0%
mlk_ct_sel_uint8 4s 2s +100%
mlk_keypair_getnoise_eta1 4s 3s +33%
mlk_poly_compress_d10_native 4s 1s +300%
mlk_poly_compress_d11_native 4s 2s +100%
mlk_poly_compress_dv 4s 4s +0%
mlk_poly_decompress_d10_c 4s 3s +33%
mlk_poly_decompress_d11 4s 1s +300%
mlk_poly_decompress_d4 4s 4s +0%
mlk_poly_decompress_d4_c 4s 2s +100%
mlk_poly_decompress_du 4s 3s +33%
mlk_poly_getnoise_eta2 4s 3s +33%
mlk_poly_ntt_c 4s 4s +0%
mlk_poly_reduce_c 4s 2s +100%
mlk_poly_tomont_native 4s 4s +0%
mlk_scalar_compress_d11 4s 2s +100%
mlk_sha3_256 4s 2s +100%
mlk_shake256x4 4s 3s +33%
poly_decompress_d4_native_x86_64 4s 1s +300%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 4s 3s +33%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
sys_check_capability 4s 3s +33%
keccakf1600_permute_native 3s 3s +0%
kem_keypair 3s 2s +50%
mlk_check_pct 3s 1s +200%
mlk_ct_cmask_neg_i16 3s 1s +200%
mlk_ct_cmov_zero 3s 2s +50%
mlk_deserialize_polyvec_16le 3s - new
mlk_gen_matrix 3s 6s -50%
mlk_keccak_absorb_once 3s 7s -57%
mlk_keccakf1600_extract_bytes (big endian) 3s 3s +0%
mlk_keccakf1600_xor_bytes (big endian) 3s 2s +50%
mlk_keccakf1600x4_extract_bytes_c 3s 2s +50%
mlk_keccakf1600x4_xor_bytes 3s 3s +0%
mlk_matvec_mul 3s 4s -25%
mlk_poly_cbd_eta2 3s 2s +50%
mlk_poly_compress_d10 3s 2s +50%
mlk_poly_compress_d10_c 3s 3s +0%
mlk_poly_compress_d5_c 3s 2s +50%
mlk_poly_decompress_d10_native 3s 4s -25%
mlk_poly_decompress_d11_c 3s 2s +50%
mlk_poly_decompress_d5 3s 2s +50%
mlk_poly_decompress_dv 3s 2s +50%
mlk_poly_getnoise_eta1122_4x 3s 3s +0%
mlk_poly_invntt_tomont 3s 3s +0%
mlk_poly_invntt_tomont_c 3s 3s +0%
mlk_poly_reduce 3s 2s +50%
mlk_poly_tobytes 3s 4s -25%
mlk_poly_tobytes_c 3s 2s +50%
mlk_polyvec_decompress_du 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 1s +200%
mlk_scalar_compress_d10 3s 1s +200%
mlk_scalar_compress_d5 3s 2s +50%
mlk_scalar_decompress_d4 3s 3s +0%
mlk_scalar_signed_to_unsigned_q 3s 3s +0%
mlk_serialize_epp 3s - new
mlk_shake128_squeezeblocks 3s 4s -25%
poly_compress_d11_native_x86_64 3s 2s +50%
poly_compress_d4_native_x86_64 3s 2s +50%
poly_frombytes_native_x86_64 3s 3s +0%
poly_getnoise_eta1122_4x_native 3s 1s +200%
poly_invntt_tomont_native 3s 3s +0%
poly_mulcache_compute_native_x86_64 3s 2s +50%
poly_reduce_native_x86_64 3s 5s -40%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
rej_uniform_native 3s 4s -25%
intt_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 1s +100%
keccakf1600x4_extract_bytes_native 2s 2s +0%
kem_check_sk 2s 2s +0%
kem_keypair_derand 2s 1s +100%
mlk_barrett_reduce 2s 2s +0%
mlk_ct_cmask_nonzero_u8 2s 2s +0%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_get_optblocker_u8 2s 3s -33%
mlk_ct_memcmp 2s 3s -33%
mlk_deserialize_epp 2s - new
mlk_keccakf1600_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_permute 2s 2s +0%
mlk_poly_cbd_eta1 2s 2s +0%
mlk_poly_compress_d4 2s 3s -33%
mlk_poly_compress_d5 2s 2s +0%
mlk_poly_compress_d5_native 2s 3s -33%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_decompress_d10 2s 3s -33%
mlk_poly_decompress_d5_c 2s 4s -50%
mlk_poly_frombytes_c 2s 3s -33%
mlk_poly_getnoise_eta1_4x_native 2s 4s -50%
mlk_poly_mulcache_compute_native 2s 2s +0%
mlk_poly_sub 2s 2s +0%
mlk_poly_tomont 2s 3s -33%
mlk_poly_tomont_c 2s 1s +100%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_frombytes 2s 2s +0%
mlk_polyvec_invntt_tomont 2s 2s +0%
mlk_rej_uniform 2s 3s -33%
mlk_scalar_compress_d4 2s 2s +0%
mlk_sha3_512 2s 2s +0%
mlk_shake128x4_absorb_once 2s 2s +0%
mlk_shake128x4_squeezeblocks 2s 2s +0%
mlk_shake256 2s 1s +100%
mlk_value_barrier_i32 2s 1s +100%
mlk_value_barrier_u8 2s 1s +100%
ntt_native_aarch64 2s 3s -33%
ntt_native_x86_64 2s 3s -33%
nttunpack_native_x86_64 2s 4s -50%
poly_compress_d10_native_x86_64 2s 3s -33%
poly_compress_d5_native_x86_64 2s 4s -50%
poly_mulcache_compute_native_aarch64 2s 3s -33%
poly_reduce_native_aarch64 2s 3s -33%
poly_tobytes_native_aarch64 2s 3s -33%
poly_tobytes_native_x86_64 2s 2s +0%
rej_uniform_native_aarch64 2s 2s +0%
intt_native_x86_64 1s 2s -50%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 2s -50%
keccakf1600x4_xor_bytes_native 1s 3s -67%
kem_enc_derand 1s 4s -75%
mlk_ct_cmask_nonzero_u16 1s 1s +0%
mlk_ct_get_optblocker_i32 1s 2s -50%
mlk_ct_sel_int16 1s 1s +0%
mlk_indcpa_enc 1s 147s -99%
mlk_keccakf1600_permute 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 1s +0%
mlk_keccakf1600x4_xor_bytes_c 1s 2s -50%
mlk_montgomery_reduce 1s 4s -75%
mlk_poly_compress_d4_c 1s 1s +0%
mlk_poly_compress_d4_native 1s 2s -50%
mlk_poly_decompress_d4_native 1s 1s +0%
mlk_poly_frombytes 1s 1s +0%
mlk_poly_mulcache_compute 1s 1s +0%
mlk_poly_tobytes_native 1s 2s -50%
mlk_polyvec_basemul_acc_montgomery_cached 1s 2s -50%
mlk_polyvec_permute_bitrev_to_custom 1s 3s -67%
mlk_polyvec_reduce 1s 2s -50%
mlk_polyvec_tobytes 1s 2s -50%
mlk_polyvec_tomont 1s 2s -50%
mlk_scalar_compress_d1 1s 3s -67%
mlk_scalar_decompress_d10 1s 2s -50%
mlk_scalar_decompress_d11 1s 4s -75%
mlk_scalar_decompress_d5 1s 3s -67%
mlk_serialize_polyvec_16le 1s - new
mlk_shake128_absorb_once 1s 4s -75%
mlk_value_barrier_u32 1s 2s -50%
poly_decompress_d10_native_x86_64 1s 1s +0%
poly_tomont_native_x86_64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 1s 2s -50%

@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Mar 13, 2026

@hanno-becker hanno-becker left a comment

Contributor

What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.

If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

@mkannwischer

mkannwischer commented Mar 13, 2026

Contributor Author

> What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.
>
> If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

The main question here is whether we want to add the new API to mlkem_native.h. If we don't, we can't test it in the standard test_mlkem.c, but we could add a separate test that includes kem.h rather than mlkem_native.h.
The purpose of 0a01cc4 was to get something working first, so that we can discuss how to proceed.

I agree with you that we don't want to keep it as is right now.

@hanno-becker

Contributor

Seeing that you also observed a slowdown on x86, I wonder if we should treat the incremental API as internal by default and only expose it in the public API if a new option MLK_CONFIG_ENABLE_MLKEM_BRAID is set?
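
A minimal sketch of what such gating could look like in a public header. Only the option name MLK_CONFIG_ENABLE_MLKEM_BRAID and the two function names come from this PR; the parameter lists below are hypothetical placeholders, not the actual mlkem-native signatures.

```c
/* Integrator config (or -DMLK_CONFIG_ENABLE_MLKEM_BRAID on the command line).
 * Without this define, the incremental functions stay internal. */
#define MLK_CONFIG_ENABLE_MLKEM_BRAID

#include <stdint.h>

#if defined(MLK_CONFIG_ENABLE_MLKEM_BRAID)
/* Hypothetical prototypes: argument names and order are assumptions.
 * Phase 1 needs only the public seed and H(pk). */
int mlk_kem_enc_derand_u(uint8_t *ct_u, uint8_t *ss, uint8_t *state,
                         const uint8_t *ek_seed, const uint8_t *h_pk,
                         const uint8_t *coins);

/* Phase 2 needs only the public-key vector and the phase-1 state. */
int mlk_kem_enc_v(uint8_t *ct_v, const uint8_t *state,
                  const uint8_t *ek_vector);
#endif /* MLK_CONFIG_ENABLE_MLKEM_BRAID */
```

With the option unset, default builds would see no change to the public header, so any slowdown or stack increase tied to the incremental path can be kept out of the standard configuration.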

@mkannwischer mkannwischer added the benchmark this PR should be benchmarked in CI label Mar 17, 2026

@oqs-bot oqs-bot left a comment

Contributor

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: a4e4e31 Previous: 2bf8e59 Ratio
ML-KEM-1024 decaps 40620 cycles 39396 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.
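
For reference, the alert above can be reproduced by hand: 40620 / 39396 ≈ 1.031, which is just over 1.03. The sketch below assumes the workflow flags a result when current/previous is strictly greater than the threshold; the function name is illustrative and not part of github-action-benchmark.

```c
#include <assert.h>

/* Returns 1 if current/previous exceeds threshold_percent/100.
 * Cross-multiplying keeps everything in integers:
 *   current / previous > threshold_percent / 100
 *   <=> current * 100 > previous * threshold_percent            */
int exceeds_threshold(unsigned long current, unsigned long previous,
                      unsigned long threshold_percent)
{
  assert(previous > 0); /* a zero baseline has no meaningful ratio */
  return current * 100UL > previous * threshold_percent;
}
```

For the flagged row, `exceeds_threshold(40620, 39396, 103)` compares 4062000 against 4057788 and reports a regression by a margin of roughly 0.1%, which is within the range where run-to-run noise is plausible.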

@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Mar 17, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 4f0ace1 to 732adb5 Compare May 7, 2026 05:35
@mkannwischer mkannwischer added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels May 7, 2026

@oqs-bot oqs-bot left a comment

Contributor

Mac Mini (M1, 2020) benchmarks

Details
Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 12318 cycles 12320 cycles 1.00
ML-KEM-512 encaps 14811 cycles 14763 cycles 1.00
ML-KEM-512 decaps 19363 cycles 19318 cycles 1.00
ML-KEM-768 keypair 21268 cycles 21269 cycles 1.00
ML-KEM-768 encaps 23524 cycles 23513 cycles 1.00
ML-KEM-768 decaps 30069 cycles 30057 cycles 1.00
ML-KEM-1024 keypair 30332 cycles 30328 cycles 1.00
ML-KEM-1024 encaps 34141 cycles 34094 cycles 1.00
ML-KEM-1024 decaps 43755 cycles 43708 cycles 1.00


@oqs-bot oqs-bot left a comment

Contributor

Arm Cortex-A55 (Snapdragon 888) benchmarks

Details
Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 59809 cycles 59789 cycles 1.00
ML-KEM-512 encaps 67442 cycles 67420 cycles 1.00
ML-KEM-512 decaps 86474 cycles 86424 cycles 1.00
ML-KEM-768 keypair 97438 cycles 97457 cycles 1.00
ML-KEM-768 encaps 110852 cycles 110756 cycles 1.00
ML-KEM-768 decaps 137951 cycles 138252 cycles 1.00
ML-KEM-1024 keypair 154756 cycles 154707 cycles 1.00
ML-KEM-1024 encaps 171127 cycles 171167 cycles 1.00
ML-KEM-1024 decaps 207595 cycles 207712 cycles 1.00


@oqs-bot oqs-bot left a comment

Contributor

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Details
Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 51223 cycles 50942 cycles 1.01
ML-KEM-512 encaps 59000 cycles 58869 cycles 1.00
ML-KEM-512 decaps 75420 cycles 74924 cycles 1.01
ML-KEM-768 keypair 86737 cycles 85776 cycles 1.01
ML-KEM-768 encaps 95064 cycles 93464 cycles 1.02
ML-KEM-768 decaps 118094 cycles 117436 cycles 1.01
ML-KEM-1024 keypair 130563 cycles 129378 cycles 1.01
ML-KEM-1024 encaps 143127 cycles 141713 cycles 1.01
ML-KEM-1024 decaps 174056 cycles 174402 cycles 1.00


@oqs-bot oqs-bot left a comment

Contributor

SpacemiT K1 8 (Banana Pi F3) benchmarks

Details
Benchmark suite Current: e9ea411 Previous: 4482a37 Ratio
ML-KEM-512 keypair 143044 cycles 143036 cycles 1.00
ML-KEM-512 encaps 151348 cycles 151397 cycles 1.00
ML-KEM-512 decaps 189670 cycles 189743 cycles 1.00
ML-KEM-768 keypair 233070 cycles 233084 cycles 1.00
ML-KEM-768 encaps 250095 cycles 250290 cycles 1.00
ML-KEM-768 decaps 304725 cycles 304913 cycles 1.00
ML-KEM-1024 keypair 365594 cycles 365632 cycles 1.00
ML-KEM-1024 encaps 387752 cycles 388994 cycles 1.00
ML-KEM-1024 decaps 457640 cycles 459309 cycles 1.00


@mkannwischer mkannwischer force-pushed the incremental-enc-api branch from 1ce787b to a4e4e31 Compare May 7, 2026 06:44
@mkannwischer mkannwischer added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels May 7, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch from a4e4e31 to 856b540 Compare May 24, 2026 07:13
@mkannwischer mkannwischer added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels May 24, 2026
@mkannwischer mkannwischer force-pushed the incremental-enc-api branch 3 times, most recently from ab8f4bf to 37d5620 Compare June 14, 2026 14:14
@rod-chapman rod-chapman force-pushed the incremental-enc-api branch 2 times, most recently from d488734 to a51ea38 Compare June 17, 2026 14:34
mkannwischer and others added 2 commits June 18, 2026 10:57
Split K-PKE.Encrypt and ML-KEM.Encaps into two phases (u and v) to
support protocols like MLKEMBraid that transmit large KEM components
in parallel over bandwidth-constrained channels.

CPA level (indcpa):
- mlk_indcpa_enc_u: computes ct_u from ek_seed, outputs intermediate
  state (sp, epp, sp_cache)
- mlk_indcpa_enc_v: computes ct_v from ek_vector using intermediate
  state from enc_u

CCA KEM level (kem):
- mlk_kem_enc_derand_u: FO transform + enc_u, outputs shared secret
  and intermediate state; only needs ek_seed and H(pk)
- mlk_kem_enc_v: modulus check on ek_vector + enc_v; only needs
  ek_vector
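
To make the split concrete, the sketch below computes the byte sizes of the pieces each phase touches, using the FIPS 203 parameters (k, du, dv). The struct and function names are illustrative only and do not appear in mlkem-native.

```c
#include <assert.h>
#include <stddef.h>

/* FIPS 203 parameters: k = module rank, du/dv = compression bits for u/v. */
typedef struct { unsigned k, du, dv; } mlkem_params;

static const mlkem_params MLKEM512  = {2, 10, 4};
static const mlkem_params MLKEM768  = {3, 10, 4};
static const mlkem_params MLKEM1024 = {4, 11, 5};

enum { N = 256, SEED_BYTES = 32 };

/* u-component: k polynomials, N coefficients of du bits each. */
size_t ct_u_bytes(mlkem_params p)
{
  assert(p.k >= 2 && p.k <= 4);
  return (size_t)p.k * N * p.du / 8;
}

/* v-component: one polynomial, N coefficients of dv bits each. */
size_t ct_v_bytes(mlkem_params p) { return (size_t)N * p.dv / 8; }

/* Public-key vector: k polynomials at 12 bits per coefficient. */
size_t ek_vector_bytes(mlkem_params p) { return (size_t)p.k * N * 12 / 8; }

/* Public seed: all that phase 1 needs from the key, besides H(pk). */
size_t ek_seed_bytes(void) { return SEED_BYTES; }
```

For ML-KEM-768 this gives a 960-byte ct_u and a 128-byte ct_v, while the key splits into a 32-byte seed and an 1152-byte vector. Phase 1 can therefore produce most of the ciphertext before the bulk of the public key has arrived.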

epp is serialized as 4-bit nibbles (ETA2 - x) to provide a natural
coefficient bound on deserialization; sp is serialized as 16-bit LE.
The shared sp mulcache is computed once and threaded through enc_u/enc_v.
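
The nibble encoding of epp described above can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `demo_` function names are made up, and the nibble order (even index in the low nibble) is a guess. ETA2 = 2 and N = 256 are the FIPS 203 values.

```c
#include <assert.h>
#include <stdint.h>

#define N 256             /* coefficients per polynomial */
#define ETA2 2            /* ML-KEM noise parameter eta2 (FIPS 203) */
#define EPP_BYTES (N / 2) /* two 4-bit nibbles per byte = 128 bytes */

/* Pack coefficients x in [-ETA2, ETA2] as nibbles ETA2 - x in [0, 4].
 * Nibble order (even index in the low nibble) is an assumption. */
void demo_serialize_epp(uint8_t out[EPP_BYTES], const int16_t epp[N])
{
  for (unsigned i = 0; i < N / 2; i++) {
    uint8_t lo = (uint8_t)(ETA2 - epp[2 * i]);
    uint8_t hi = (uint8_t)(ETA2 - epp[2 * i + 1]);
    /* Debug-only range check; compiled out with NDEBUG. */
    assert(lo <= 2 * ETA2 && hi <= 2 * ETA2);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Every possible byte decodes to coefficients in [ETA2 - 15, ETA2],
 * i.e. [-13, 2], so no precondition on the input bytes is needed. */
void demo_deserialize_epp(int16_t epp[N], const uint8_t in[EPP_BYTES])
{
  for (unsigned i = 0; i < N / 2; i++) {
    epp[2 * i]     = (int16_t)(ETA2 - (in[i] & 0x0F));
    epp[2 * i + 1] = (int16_t)(ETA2 - (in[i] >> 4));
  }
}
```

Because a nibble can only hold 0..15, the decoder bounds every coefficient by construction. A 16-bit encoding would instead need an explicit range check or a contract precondition on the serialized state.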

Includes CBMC contracts and proofs for the new functions, the
MLK_CONFIG_ENABLE_MLKEM_BRAID configuration option exposing the API,
recomputed peak stack consumption values, and OpenTitan work buffer
size updates.

The test verifies that the incremental API produces the same
ciphertexts and shared secrets as the standard API across all three
parameter sets.

Co-authored-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Signed-off-by: Rod Chapman <rodchap@amazon.com>
Signed-off-by: Rod Chapman <rodchap@amazon.com>
@rod-chapman rod-chapman force-pushed the incremental-enc-api branch from a1672a8 to e9ea411 Compare June 18, 2026 09:58
@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Jun 19, 2026

Labels

benchmark this PR should be benchmarked in CI

4 participants